Psychometric Tests for AI Agents and Their Moduli Space

Chojecki, Przemyslaw

arXiv.org Artificial Intelligence

We develop a moduli-theoretic view of psychometric test batteries for AI agents and connect it explicitly to the AAI score developed previously. First, we make precise the notion of an AAI functional on a battery and set out axioms that any reasonable autonomy/general intelligence score should satisfy. Second, we show that the composite index ('AAI-Index') defined previously is a special case of our AAI functional. Third, we introduce the notion of a cognitive core of an agent relative to a battery and define the associated AAI$_{\textrm{core}}$ score as the restriction of an AAI functional to that core. Finally, we use these notions to describe invariants of batteries under evaluation-preserving symmetries and outline how moduli of equivalent batteries are organized.


A Gapped Scale-Sensitive Dimension and Lower Bounds for Offset Rademacher Complexity

Jia, Zeyu, Polyanskiy, Yury, Rakhlin, Alexander

arXiv.org Machine Learning

The celebrated Vapnik-Chervonenkis dimension vc(F) of a binary-valued function class F and the scale-sensitive dimension vc(F, α) of a real-valued function class F are central notions in the study of empirical processes and convergence of statistical learning methods [VC71, BLW94, KS94]. Sequential analogues of these notions--the Littlestone dimension ldim(F) and the sequential scale-sensitive dimension sfat(F, α)--have been shown to play an analogously central role in the study of uniform martingale laws and online prediction [Lit88, BDPSS09, RST10]. In this paper, we study "gapped" versions of vc(F, α) and sfat(F, α). The modification yields a dimension that is no larger than the original one, yet can still be shown to control covering numbers in both sequential and non-sequential cases. More importantly, the new notion gives us a more precise control on the functions involved in "shattering" and thus yields non-vacuous lower bounds for offset Rademacher complexities for any uniformly bounded class--both in the classical and sequential cases--and, as a consequence, tighter lower bounds for online prediction problems, such as online regression or transductive learning. Our definition in the non-sequential case can also be seen as a modification of the Natarajan dimension [NT88, Nat89], and was, in fact, introduced in [AB00]. We first motivate the development in this paper on the simpler case of non-sequential data. We start by recalling the definition of the Vapnik-Chervonenkis dimension and its scale-sensitive version.
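The shattering-based definition of the VC dimension recalled in this abstract can be made concrete with a small brute-force check. The sketch below is not from the paper; it uses two classical toy classes (one-sided thresholds, with VC dimension 1, and intervals, with VC dimension 2) on a small finite domain:

```python
from itertools import combinations, product

def shatters(hypotheses, points):
    """The class shatters `points` iff it realises all 2^n binary labelings."""
    labelings = {tuple(h(x) for x in points) for h in hypotheses}
    return len(labelings) == 2 ** len(points)

def vc_dimension(hypotheses, domain, max_d=5):
    """Largest n such that some n-subset of `domain` is shattered (brute force)."""
    d = 0
    for n in range(1, max_d + 1):
        if any(shatters(hypotheses, s) for s in combinations(domain, n)):
            d = n
    return d

domain = [0, 1, 2, 3]
# One-sided thresholds x >= t: cannot realise the labeling (+, -) on a pair.
thresholds = [lambda x, t=t: x >= t for t in range(-1, 5)]
# Intervals [a, b]: shatter pairs, but never the pattern (+, -, +) on a triple.
intervals = [lambda x, a=a, b=b: a <= x <= b
             for a, b in product(range(-1, 5), repeat=2)]

print(vc_dimension(thresholds, domain))  # 1
print(vc_dimension(intervals, domain))   # 2
```

The scale-sensitive version vc(F, α) studied in the paper replaces exact binary shattering with shattering at margin α for real-valued classes; the combinatorial skeleton of the definition is the same.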



A pseudo-inverse of a line graph

Kandanaarachchi, Sevvandi, Kilby, Philip, Ong, Cheng Soon

arXiv.org Machine Learning

Graph perturbations are used to test the robustness of algorithms: the expectation is that for small graph perturbations the algorithm's output should not change drastically. While graph perturbations are extensively studied in many contexts, they are underexplored for line graphs, where a line graph is an alternative representation of a graph obtained by mapping edges to vertices. Yet line graphs are increasingly used in many graph learning tasks, including link prediction (Cai et al., 2021), expressive GNNs (Yang & Huang, 2024) and community detection (Chen et al., 2019), as well as in other scientific disciplines (Ruff et al., 2024; Min et al., 2023; Halldórsson et al., 2013). The reason that line graph perturbations are not commonly used is that the perturbed graph may not be a line graph. We introduce a pseudo-inverse of a line graph, which generalises the notion of the inverse line graph by extending it to non-line graphs. The proposed pseudo-inverse is computed by minimally modifying the perturbed line graph so that the result is a line graph.
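The forward line-graph map at the heart of this abstract is easy to state in code. A minimal pure-Python sketch (not from the paper; the pseudo-inverse itself is the paper's contribution and is not reproduced here), using the 4-cycle, whose line graph is again a 4-cycle:

```python
from itertools import combinations

def line_graph(edges):
    """Each edge of G becomes a vertex; two vertices are adjacent
    iff the corresponding edges of G share an endpoint."""
    return {frozenset({e, f}) for e, f in combinations(edges, 2)
            if set(e) & set(f)}

# C4: every edge meets exactly two others, so L(C4) is again a 4-cycle.
c4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
L = line_graph(c4)
degree = {e: sum(e in pair for pair in L) for e in c4}
print(len(L), sorted(degree.values()))  # 4 [2, 2, 2, 2]
```

The difficulty the paper addresses shows up as soon as such a line graph is perturbed: adding or deleting an edge can produce a graph containing a forbidden induced subgraph (the claw K_{1,3} is the classic example), for which no exact inverse line graph exists.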


Bridging Pattern-Aware Complexity with NP-Hard Optimization: A Unifying Framework and Empirical Study

Saidi, Olivier

arXiv.org Artificial Intelligence

NP-hard optimization problems like the Traveling Salesman Problem (TSP) defy efficient solutions in the worst case, yet real-world instances often exhibit exploitable patterns. We propose a novel pattern-aware complexity framework that quantifies and leverages structural regularities (e.g., clustering, symmetry) to reduce effective computational complexity across domains, including financial forecasting and LLM optimization. With rigorous definitions, theorems, and a meta-learning-driven solver pipeline, we introduce metrics such as Pattern Utilization Efficiency (PUE) and achieve up to 79 percent solution quality gains on TSP benchmarks (22 to 2392 cities). Distinct from theoretical NP-hardness, our approach offers a unified, practical lens for pattern-driven efficiency.
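The idea that clustering structure makes TSP instances easier than their worst case can be illustrated with a tiny experiment. The sketch below is invented for illustration (it does not implement the paper's framework or the PUE metric): on an instance with two well-separated clusters, a greedy nearest-neighbor heuristic implicitly exploits the structure, staying inside each cluster and crossing the gap only twice, while an order-oblivious tour crosses it many times:

```python
import random, math

def tour_length(pts, order):
    """Total length of the closed tour visiting pts in the given order."""
    return sum(math.dist(pts[order[i]], pts[order[(i + 1) % len(order)]])
               for i in range(len(order)))

def nearest_neighbor(pts, start=0):
    """Greedy TSP heuristic: repeatedly visit the closest unvisited city."""
    unvisited, order = set(range(len(pts))) - {start}, [start]
    while unvisited:
        nxt = min(unvisited, key=lambda j: math.dist(pts[order[-1]], pts[j]))
        order.append(nxt)
        unvisited.remove(nxt)
    return order

random.seed(0)
# Two clusters separated by a gap 10x the cluster diameter.
pts = ([(random.random(), random.random()) for _ in range(15)] +
       [(random.random() + 10, random.random()) for _ in range(15)])

greedy = tour_length(pts, nearest_neighbor(pts))
naive = tour_length(pts, random.sample(range(len(pts)), len(pts)))
print(round(greedy, 1), round(naive, 1))  # the greedy tour is far shorter
```

On unstructured (e.g., uniformly scattered) instances the same heuristic gains much less, which is the intuition behind quantifying how much of an instance's regularity a solver actually uses.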


A Theoretical Analysis of Compositional Generalization in Neural Networks: A Necessary and Sufficient Condition

Li, Yuanpeng

arXiv.org Artificial Intelligence

Compositional generalization is a crucial property in artificial intelligence, enabling models to handle novel combinations of known components. While most deep learning models lack this capability, certain models succeed in specific tasks, suggesting the existence of governing conditions. This paper derives a necessary and sufficient condition for compositional generalization in neural networks. Conceptually, it requires that (i) the computational graph matches the true compositional structure, and (ii) components encode just enough information in training. The condition is supported by mathematical proofs. This criterion combines aspects of architecture design, regularization, and training data properties. A carefully designed minimal example illustrates an intuitive understanding of the condition. We also discuss the potential of the condition for assessing compositional generalization before training. This work is a fundamental theoretical study of compositional generalization in neural networks.


Beyond Static Assumptions: the Predictive Justified Perspective Model for Epistemic Planning

Li, Weijia, Hu, Guang, Xu, Yangmengfei

arXiv.org Artificial Intelligence

Epistemic Planning (EP) is an important research area dedicated to reasoning about the knowledge and beliefs of agents in multi-agent cooperative or adversarial settings. The Justified Perspective (JP) model is the state-of-the-art approach to solving EP problems with efficiency and expressiveness. However, all existing EP methods inherit the static environment assumption from classical planning. This limitation hinders the application of EP in fields such as robotics with multi-agent settings, where the environment contains changing variables. In this paper, we propose an extension of the JP model, namely, the Predictive Justified Perspective (PJP) model, to remove this assumption. Instead of assuming that beliefs remain unchanged since the last observation, the PJP model uses all past observations to form predictions about the changing variables. We define the prediction function, illustrate it with examples, and demonstrate that it works with arbitrary nesting. We then implemented the PJP model in several well-known domains and compared it with the JP model experimentally. The results indicate that the PJP model performs exceptionally well across various domains, demonstrating its potential for improving EP applications in robotics.


Partial Identifiability and Misspecification in Inverse Reinforcement Learning

Skalse, Joar, Abate, Alessandro

arXiv.org Artificial Intelligence

The aim of Inverse Reinforcement Learning (IRL) is to infer a reward function $R$ from a policy $\pi$. This problem is difficult for several reasons. First, there are typically multiple reward functions which are compatible with a given policy; this means that the reward function is only *partially identifiable*, and that IRL contains a certain fundamental degree of ambiguity. Secondly, in order to infer $R$ from $\pi$, an IRL algorithm must have a *behavioural model* of how $\pi$ relates to $R$. However, the true relationship between human preferences and human behaviour is very complex, and practically impossible to fully capture with a simple model. This means that the behavioural model in practice will be *misspecified*, which raises the worry that it might lead to unsound inferences if applied to real-world data. In this paper, we provide a comprehensive mathematical analysis of partial identifiability and misspecification in IRL. Specifically, we fully characterise and quantify the ambiguity of the reward function for all of the behavioural models that are most common in the current IRL literature. We also provide necessary and sufficient conditions that describe precisely how the observed demonstrator policy may differ from each of the standard behavioural models before that model leads to faulty inferences about the reward function $R$. In addition, we introduce a cohesive framework for reasoning about partial identifiability and misspecification in IRL, together with several formal tools that can be used to easily derive the partial identifiability and misspecification robustness of new IRL models, or analyse other kinds of reward learning algorithms.
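The partial identifiability described here can be seen directly in a toy example. One well-known source of ambiguity is that positive scaling and constant shifts of the reward leave the optimal policy unchanged, so the two rewards are indistinguishable from optimal behaviour alone. A minimal sketch (invented for illustration, not from the paper), using value iteration on a random finite MDP:

```python
import numpy as np

def optimal_policy(R, P, gamma=0.9, iters=500):
    """Value iteration; returns the greedy policy for reward R(s, a).
    P has shape (nS, nA, nS): P[s, a, s'] = transition probability."""
    V = np.zeros(R.shape[0])
    for _ in range(iters):
        V = (R + gamma * P @ V).max(axis=1)
    return (R + gamma * P @ V).argmax(axis=1)

rng = np.random.default_rng(0)
nS, nA = 5, 2
P = rng.dirichlet(np.ones(nS), size=(nS, nA))  # random transition kernel
R = rng.normal(size=(nS, nA))

# A positive affine transformation of the reward preserves the ordering of
# Q-values in every state, hence the greedy (optimal) policy.
R_transformed = 3.0 * R + 7.0
print((optimal_policy(R, P) == optimal_policy(R_transformed, P)).all())  # True
```

An IRL algorithm observing only the optimal policy therefore cannot distinguish $R$ from $3R + 7$; characterising exactly how large such ambiguity classes are, for various behavioural models, is what the paper does formally.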


A Methodology for Gradual Semantics for Structured Argumentation under Incomplete Information

Rago, Antonio, Vasileiou, Stylianos Loukas, Toni, Francesca, Son, Tran Cao, Yeoh, William

arXiv.org Artificial Intelligence

Gradual semantics have demonstrated great potential in argumentation, in particular for deploying quantitative bipolar argumentation frameworks (QBAFs) in a number of real-world settings, from judgmental forecasting to explainable AI. In this paper, we provide a novel methodology for obtaining gradual semantics for structured argumentation frameworks, where the building blocks of arguments and the relations between them are known, unlike in QBAFs, where arguments are abstract entities. Differently from existing approaches, our methodology accommodates incomplete information about arguments' premises. We demonstrate the potential of our approach by introducing two different instantiations of the methodology, leveraging existing gradual semantics for QBAFs in these more complex frameworks. We also define a set of novel properties for gradual semantics in structured argumentation and discuss their suitability relative to a set of existing properties. Finally, we provide a comprehensive theoretical analysis of the instantiations, demonstrating their advantages over existing gradual semantics for QBAFs and structured argumentation.
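For readers unfamiliar with gradual semantics for QBAFs, here is a minimal sketch of one widely used example, the DF-QuAD combination function (the abstract does not specify which existing QBAF semantics the paper instantiates; the toy QBAF and base scores below are invented for illustration):

```python
from functools import lru_cache

# Toy acyclic QBAF: each argument has a base score in [0, 1],
# plus attack and support relations.
base = {"a": 0.5, "b": 0.8, "c": 0.6}
att = {"a": ["b"]}   # b attacks a
sup = {"a": ["c"]}   # c supports a

def aggregate(children):
    """DF-QuAD's probabilistic aggregation: 1 - prod(1 - strength)."""
    prod = 1.0
    for c in children:
        prod *= 1.0 - strength(c)
    return 1.0 - prod

@lru_cache(maxsize=None)
def strength(arg):
    """Final strength: base score moved toward 0 or 1 depending on
    whether aggregated attack or support dominates."""
    v0 = base[arg]
    va = aggregate(att.get(arg, ()))
    vs = aggregate(sup.get(arg, ()))
    if va >= vs:
        return v0 - v0 * (va - vs)
    return v0 + (1.0 - v0) * (vs - va)

print(strength("b"))  # 0.8: no attackers or supporters, so the base score
print(strength("a"))  # 0.4: attack (0.8) outweighs support (0.6)
```

In a QBAF the nodes "a", "b", "c" are abstract; the paper's methodology instead works with structured arguments, whose premises and rules are visible, and tolerates incomplete information about those premises.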